-
Notifications
You must be signed in to change notification settings - Fork 12.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[llvm][Support] fix convertToSnakeFromCamelCase #68375
Conversation
@llvm/pr-subscribers-llvm-support @llvm/pr-subscribers-llvm-adt ChangesCurrently runs of caps aren't handled correctly so e.g. something like Full diff: https://github.com/llvm/llvm-project/pull/68375.diff 2 Files Affected:
diff --git a/llvm/lib/Support/StringExtras.cpp b/llvm/lib/Support/StringExtras.cpp
index 5683d7005584eb2..fd5a34fb3d6e82c 100644
--- a/llvm/lib/Support/StringExtras.cpp
+++ b/llvm/lib/Support/StringExtras.cpp
@@ -12,6 +12,7 @@
#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/SmallVector.h"
+#include "llvm/Support/Regex.h"
#include "llvm/Support/raw_ostream.h"
#include <cctype>
@@ -96,18 +97,13 @@ std::string llvm::convertToSnakeFromCamelCase(StringRef input) {
if (input.empty())
return "";
- std::string snakeCase;
- snakeCase.reserve(input.size());
- for (char c : input) {
- if (!std::isupper(c)) {
- snakeCase.push_back(c);
- continue;
- }
-
- if (!snakeCase.empty() && snakeCase.back() != '_')
- snakeCase.push_back('_');
- snakeCase.push_back(llvm::toLower(c));
+ std::string snakeCase = input.str();
+ for (int i = 0; i < 10; ++i) {
+ snakeCase = llvm::Regex("([A-Z]+)([A-Z][a-z])").sub("\\1_\\2", snakeCase);
+ snakeCase = llvm::Regex("([a-z0-9])([A-Z])").sub("\\1_\\2", snakeCase);
}
+ std::transform(snakeCase.begin(), snakeCase.end(), snakeCase.begin(),
+ [](unsigned char c) { return std::tolower(c); });
return snakeCase;
}
diff --git a/llvm/unittests/ADT/StringExtrasTest.cpp b/llvm/unittests/ADT/StringExtrasTest.cpp
index 3f69c91b270a355..fab562f1ed0d594 100644
--- a/llvm/unittests/ADT/StringExtrasTest.cpp
+++ b/llvm/unittests/ADT/StringExtrasTest.cpp
@@ -184,6 +184,11 @@ TEST(StringExtrasTest, ConvertToSnakeFromCamelCase) {
testConvertToSnakeCase("OpName", "op_name");
testConvertToSnakeCase("opName", "op_name");
+ testConvertToSnakeCase("OPName", "op_name");
+ testConvertToSnakeCase("opNAME", "op_name");
+ testConvertToSnakeCase("opNAMe", "op_na_me");
+ testConvertToSnakeCase("opnameE", "opname_e");
+ testConvertToSnakeCase("OPNameOPName", "op_name_op_name");
testConvertToSnakeCase("_OpName", "_op_name");
testConvertToSnakeCase("Op_Name", "op_name");
testConvertToSnakeCase("", "");
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
32422a2
to
d131ddb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with some minor comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this version a lot better than the regex :)
LGTM % nits
278560c
to
3ef9905
Compare
3ef9905
to
6bf2dc2
Compare
This PR adds the additional generation of what I'm calling "value builders" (a term I'm not married to) that look like this: ```python def empty(sizes, element_type, *, loc=None, ip=None): return get_result_or_results(tensor.EmptyOp(sizes=sizes, element_type=element_type, loc=loc, ip=ip)) ``` which instantiates a `tensor.EmptyOp` and then immediately grabs the result (`OpResult`) and then returns that *instead of a handle to the op*. What's the point of adding these when `EmptyOp.result` already exists? My claim/feeling/intuition is that eDSL users are more comfortable with a value centric programming model (i.e., passing values as operands) as opposed to an operator instantiation programming model. Thus this change enables (or at least goes towards) the bindings supporting such a user and use case. For example, ```python i32 = IntegerType.get_signless(32) ... ten1 = tensor.empty((10, 10), i32) ten2 = tensor.empty((10, 10), i32) ten3 = arith.addi(ten1, ten2) ``` Note, in order to present a "pythonic" API and enable "pythonic" eDSLs, the generated identifiers (op names and operand names) are snake case instead of camel case and thus `llvm::convertToSnakeFromCamelCase` needed a small fix. Thus this PR is stacked on top of #68375. In addition, as a kind of victory lap, this PR adds a "rangefor" that looks and acts exactly like python's `range` but emits `scf.for`.
Currently runs of caps aren't handled correctly so e.g. something like
Intel_OCL_BI
is snake cased tointel_o_c_l_b_i
(previously discussed on this phabricator patch).